
UCL Discovery

arXiv.org e-Print Archive

Composite structural motifs of binding sites for delineating biological functions of proteins

Author: A Bairoch
A Fiorillo
A Rausell
A Stark
AC Joerger
AC Wallace
AG Murzin
Akira R. Kinjo
AM Schnoes
AR Kinjo
AR Kinjo
AR Kinjo
B Bollobás
B Dasgupta
B Louie
B Rost
BH Dessailly
C Branden
C Winter
CV Robinson
D Petrey
DJ Schuller
DM Chipman
E Krissinel
E Toyota
FP Davis
FP Davis
GM Santos
H Berman
H Kettenberger
Haruki Nakamura
I Friedberg
J Janin
J Shi
J Westbrook
JI Yeh
K Chen
K Henrick
K Kinoshita
K Kinoshita
K Kinoshita
K Okazaki
K Stenberg
L Xie
M Bashton
M Brylinski
M Kitayner
M Levitt
M Moertl
M Nardini
M Tyagi
M Yang
N Nagano
N Tuncbag
N Tuncbag
N Zhao
ND Gold
O Keskin
O Keskin
OC Redfern
Ozlem Keskin
P Cramer
P Shannon
PD Pawelek
R Koike
R Koike
R Rentzsch
R Sinha
RR Thangudu
S Kadono
SF Altschul
T Amemiya
T Kawabata
T Kawabata
TA Holland
TC Terwilliger
Y Loewenstein
Z Aung
ZX Xia
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2011

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands, including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs, which represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity than from the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs, each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. Comment: 34 pages, 7 figures

CiteSeerX

BEAR (Buckingham E-Archive of Research)

Structure-guided selection of specificity determining positions in the human kinome

Author: A Rausell
BY Chen
C Schalon
D Huang
D Kuhn
DH Bryant
F Milletti
F Pazos
GE Crooks
I Halperin
I Steinwart
IT Jolliffe
J Blanc
J Mok
JA Bikker
JA Capra
L Xie
L Xing
Lydia E. Kavraki
M Menke
M Moll
Mark Moll
MC Heinrich
MW Karaman
O Weigert
OC Redfern
OV Kalinina
Paul W. Finn
PW Finn
RC de Melo-Minardi
RD Finn
RK Kancha
S Chakrabarti
S Redaelli
SB Hari
SL Kinnings
T Liu
T Trowe
Y Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/08/2016

Background: The human kinome contains many important drug targets. It is well known that inhibitors of protein kinases bind with very different selectivity profiles. This is also the case for inhibitors of many other protein families. The increased availability of protein 3D structures has provided much information on the structural variation within a given protein family. However, the relationship between structural variations and binding specificity is complex and incompletely understood. We have developed a structural bioinformatics approach which provides an analysis of key determinants of binding selectivity as a tool to enhance the rational design of drugs with a specific selectivity profile. Results: We propose a greedy algorithm that computes a subset of residue positions in a multiple sequence alignment such that structural and chemical variation in those positions helps explain known binding affinities. The main purpose of the algorithm is to give experimentalists possible insights into how the selectivity profile of certain inhibitors is achieved, which is useful for lead optimization. In addition, the algorithm can be used to predict binding affinities for structures whose affinity for a given inhibitor is unknown. The algorithm’s performance is demonstrated using an extensive dataset for the human kinome. Conclusion: We show that the binding affinity of 38 different kinase inhibitors can be explained with consistently high precision and accuracy using the variation of at most six residue positions in the kinome binding site. We show for several inhibitors that we are able to identify residues that are known to be functionally important.
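The greedy selection of residue positions mentioned in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not give the details: each kinase is represented as a string of aligned binding-site residues, each carries a binary label (1 = binds the inhibitor, 0 = does not), and a candidate set of positions is scored by leave-one-out nearest-neighbour accuracy under Hamming distance. The function names and the toy data are hypothetical and are not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def loo_accuracy(seqs: Sequence[str], labels: Sequence[int],
                 positions: List[int]) -> float:
    """Leave-one-out 1-nearest-neighbour accuracy using only the chosen columns."""
    correct = 0
    for i, query in enumerate(seqs):
        best_dist, best_label = None, None
        for j, other in enumerate(seqs):
            if i == j:
                continue
            # Hamming distance restricted to the selected alignment columns.
            dist = sum(query[p] != other[p] for p in positions)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += int(best_label == labels[i])
    return correct / len(seqs)


def greedy_positions(seqs: Sequence[str], labels: Sequence[int],
                     max_positions: int = 6) -> Tuple[List[int], float]:
    """Add one column at a time, keeping the column that most improves the score."""
    chosen: List[int] = []
    best_score = loo_accuracy(seqs, labels, chosen)
    n_cols = len(seqs[0])
    while len(chosen) < max_positions:
        candidates = [(loo_accuracy(seqs, labels, chosen + [p]), p)
                      for p in range(n_cols) if p not in chosen]
        if not candidates:
            break
        # Highest score wins; ties go to the lowest column index.
        score, pos = max(candidates, key=lambda c: (c[0], -c[1]))
        if score <= best_score:
            break  # no remaining column improves the explanation
        chosen.append(pos)
        best_score = score
    return chosen, best_score


# Toy alignment (hypothetical): only column 0 separates binders from non-binders.
toy_seqs = ["AKT", "ARS", "AKS", "GKT", "GRS", "GRT"]
toy_labels = [1, 1, 1, 0, 0, 0]
print(greedy_positions(toy_seqs, toy_labels))
```

The cap of six positions mirrors the "at most six residue positions" reported in the abstract. The stopping rule (halt when no column improves the score) is a design choice of this sketch.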

DSpace at Rice University

BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

Author: A Andreeva
A Bateman
A Godzik
A Gutteridge
A Kouranov
A Shulman-Peleg
AC Wallace
AG Murzin
BE Engelhardt
Bing Xiong
C Chothia
CA Orengo
D Lee
D Pal
David L Burk
DF Veber
GJ Kleywegt
GP Brady
HM Berman
Hualiang Jiang
J Blaszczyk
J Soding
Jie Wu
Jingkang Shen
K Lundstrom
K Yeturu
L Holm
L Xie
M Ashburner
M Brylinski
Mengzhu Xue
MP Liang
ND Gold
OC Redfern
P Willett
RA Laskowski
RA Laskowski
RB Russell
S Schmitt
SF Altschul
SG Buchanan
T Fawcett
T Hamelryck
TA Binkowski
WA Warr
WR Pearson
XY Jiang
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparing with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After conducting the database searching, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of hit list from database searching.
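A fingerprint search ranked by Z-score, as described in this abstract, can be sketched as follows. The abstract does not specify the fingerprint encoding or the scoring function, so this sketch makes assumptions: a fingerprint is a set of "on" bit indices, similarity is the Tanimoto coefficient, and the Z-score is computed against the distribution of the query's scores over the whole database. All names and the toy database are hypothetical and do not reproduce the BSSF server's actual scheme.

```python
from statistics import mean, pstdev
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def z_score_search(query: FrozenSet[int],
                   database: Dict[str, FrozenSet[int]],
                   z_cutoff: float = 1.0) -> List[Tuple[str, float]]:
    """Return (site name, Z-score) pairs at or above the cutoff, best first."""
    names = list(database)
    scores = [tanimoto(query, database[name]) for name in names]
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return []  # all sites score the same, so nothing stands out
    hits = [(name, (score - mu) / sigma) for name, score in zip(names, scores)]
    return sorted((h for h in hits if h[1] >= z_cutoff),
                  key=lambda h: h[1], reverse=True)


# Toy database (hypothetical bit indices).
toy_db = {
    "site_a": frozenset({1, 2, 3, 4}),
    "site_b": frozenset({1, 2, 9}),
    "site_c": frozenset({7, 8}),
    "site_d": frozenset({4, 10, 11}),
}
for name, z in z_score_search(frozenset({1, 2, 3, 4}), toy_db):
    print(f"{name}: Z = {z:.2f}")
```

Set operations on fixed-length fingerprints are cheap compared with 3D superposition, which is consistent with the speed advantage over geometric hashing that the abstract claims.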

Queen's University Belfast Research Portal


Southampton (e-Prints Soton)

Online Research Database In Technology

Accurate Protein Structure Annotation through Competitive Diffusion of Enzymatic Functions over a Network of Local Evolutionary Similarities

Author: A Arakaki
A Ribes-Zamora
A Vazquez
AD Wilkins
AM Schnoes
Andreas Martin Lisewski
B Adamcsek
BE Engelhardt
Christos Ouzounis
CT Porter
D Barrell
D Warde-Farley
D Zhou
DE Almonacid
DM Kristensen
DS Glazer
E Levy
E Nabieva
EM Marcotte
Eric Venner
F Baameur
F Ferre
F Glaser
F Pazos
G Bader
GJ Rodriguez
H Hishigaki
H Kobayashi
H Shin
H Yao
HJ Atkinson
HN Chua
I Friedberg
I Lee
I Lee
I Mihalek
I Mihalek
J Byun
J Chandonia
J Rhee
J Song
J Westbrook
JA Capra
JD Watson
JJ Mukherjee
K Krisch
K Tsuda
K Wang
L Holm
L Jaroszewski
L Rajagopalan
LH Greene
M Deng
M Larkin
ME Sowa
ME Sowa
MEJ Newman
MI Sadowski
MK Ross
MM Bonde
N Furnham
N Nariai
ND Gold
O Lichtarge
O Lichtarge
O Lichtarge
OC Redfern
OC Redfern
Olivier Lichtarge
P Gu
P Hu
PA Alexander
PC Wu
PF Gherardini
R Onrust
R Sharan
R She
R. Matthew Ward
RA Chiang
RA Laskowski
RA Laskowski
RA Laskowski
RM Ward
S Altschul
S Erdin
S Hennig
S Madabushi
S Madabushi
SB Pandit
SD Copley
SE Brenner
SE Brenner
Serkan Erdin
SF Altschul
Shivas R. Amin
SK Shenoy
SR Collins
SR Gill
T Hsiao
V van Noort
X Quan
Y Qi
YY Tseng
Publication venue: Public Library of Science
Publication date: 13/12/2010

High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks
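The idea of competing functional labels spreading over a similarity network can be illustrated with a generic label-propagation sketch. This is a swapped-in, simplified technique and not the authors' diffusion formulation: proteins with known function are clamped as seeds, every other node repeatedly takes the average of its neighbours' scores, and each node is finally assigned the function with the highest score. The per-function z-score normalization mentioned in the abstract is omitted, and the graph, function names, and iteration count are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def propagate_labels(edges: Iterable[Tuple[str, str]],
                     seeds: Dict[str, str],
                     n_iter: int = 50) -> Dict[str, Optional[str]]:
    """Spread seed labels over an undirected graph and pick a winner per node."""
    neighbours: Dict[str, Set[str]] = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    for node in seeds:
        neighbours.setdefault(node, set())  # keep isolated seeds in the graph

    functions = sorted(set(seeds.values()))
    scores = {node: {f: 0.0 for f in functions} for node in neighbours}
    for node, function in seeds.items():
        scores[node][function] = 1.0

    for _ in range(n_iter):
        updated = {}
        for node, nbrs in neighbours.items():
            if node in seeds:
                updated[node] = scores[node]  # known annotations stay fixed
            else:
                updated[node] = {f: sum(scores[n][f] for n in nbrs) / len(nbrs)
                                 for f in functions}
        scores = updated

    annotation: Dict[str, Optional[str]] = {}
    for node, node_scores in scores.items():
        best = max(functions, key=lambda f: node_scores[f])
        # Nodes never reached by any label remain unannotated.
        annotation[node] = best if node_scores[best] > 0 else None
    return annotation


# Toy chain A - B - C - D with one known function at each end (hypothetical).
toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
toy_seeds = {"A": "carboxylesterase", "D": "kinase"}
print(propagate_labels(toy_edges, toy_seeds))
```

In this toy chain, B ends up closer to the carboxylesterase seed and C closer to the kinase seed, so each inherits the label of its nearer seed. This is the "most significant label wins" behaviour in miniature.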

Structural genomics is the largest contributor of novel structural leverage

Author: A Andreeva
A Bhattacharya
A Grant
A Harrison
Adam Godzik
AG Murzin
Andras Fiser
Andrei Kouranov
Burkhard Rost
C Chothia
C Sander
C Yeats
CA Orengo
Christine Orengo
CM Fraser-Liggett
Gaetano T. Montelione
GW Tyson
H Berman
HM Berman
IYY Koh
J Kopp
J Liu
J Liu
J Liu
J Liu
J Moult
J Moult
JC Norvell
JD Watson
Jinfeng Liu
JM Chandonia
John K. Everett
L Chen
Lukasz Jaroszewski
M Gerstein
M Levitt
MA Marti-Renom
MA Marti-Renom
N Fernandez-Fuentes
OC Redfern
PE Bourne
R Apweiler
R Nair
Rajesh Nair
RL Marsden
S Yooseph
Ta-Tsen Soong
Thomas B. Acton
U Pieper
U Pieper
Publication venue: Springer Netherlands
Publication date: 01/01/2009

The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e. structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per-structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may take just another ~15 years to cover most sequences in the current UniProt database.

eScholarship - University of California

UCL Discovery

A novel method to compare protein structures using local descriptors

Abstract. Background: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL.

A Measure of the Promiscuity of Proteins and Characteristics of Residues in the Vicinity of the Catalytic Site That Regulate Promiscuity

Promiscuity, the basis for the evolution of new functions through ‘tinkering’ of residues in the vicinity of the catalytic site, is yet to be quantitatively defined. We present a computational method, Promiscuity Indices Estimator (PROMISE), based on signatures derived from the spatial and electrostatic properties of the catalytic residues, to estimate the promiscuity (PromIndex) of proteins with known active site residues and 3D structure. PromIndex reflects the number of different active site signatures that have congruent matches in close proximity of the protein's native catalytic site, the quality of the matches and the difference in enzymatic activity. Promiscuity in proteins is observed to follow a lognormal distribution (μ = 0.28, σ = 1.1, reduced chi-square = 3.0E-5). The promiscuous functions predicted by PROMISE for any protein can serve as the starting point for directed evolution experiments. PROMISE ranks carboxypeptidase A and ribonuclease A amongst the more promiscuous proteins. We have also investigated the properties of the residues in the vicinity of the catalytic site that regulate its promiscuity. Linear regression establishes a weak correlation (R² ≈ 0.1) between certain properties of the residues (charge, polarity, etc.) in the neighborhood of the catalytic residues and PromIndex. A stronger relationship is that most proteins with high promiscuity have high percentages of charged and polar residues within a radius of 3 Å of the catalytic site, which is validated using one-tailed hypothesis tests (P-values ≈ 0.05). Since it is known that these characteristics are key factors in catalysis, their relationship with the promiscuity index cross-validates the methodology of PROMISE.

CiteSeerX